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Using a Genetic Algorithm to Model Broadband Regional Waveforms 
for Crustal Structure in the Western United States 

by Joydeep Bhattacharyya,* Anne F. Sheehan, Kristy Tiampo, and John Rundle 

Abstract In this study, we analyze regional seismograms to obtain the crustal 
structure in the eastern Great Basin and western Colorado plateau. Adopting a for- 
ward-modeling approach, we develop a genetic algorithm (GA) based parameter 
search technique to constrain the one-dimensional crustal structure in these regions. 

The data are broadband three-component seismograms recorded at the 1994-95 IRIS 
PASSCAL Colorado Plateau to Great Basin experiment (CPGB) stations and supple- 
mented by data from U.S. National Seismic Network (USNSN) stations in Utah and 
Nevada. We use the southwestern Wyoming mine collapse event (M h = 5.2) that 
occurred on 3 February 1995 as the seismic source. We model the regional seis- 
mograms using a four-layer crustal model with constant layer parameters. Timing of 
teleseismic receiver functions at CPGB stations are added as an additional constraint 
in the modeling. 

GA allows us to efficiently search the model space. A carefully chosen fitness 
function and a windowing scheme are added to the algorithm to prevent search 
stagnation. The technique is tested with synthetic data, both with and without random 
Gaussian noise added to it. Several separate model searches are carried out to estimate 
the variability of the model parameters. The average Colorado plateau crustal struc- 
ture is characterized by a 40-km-thick crust with velocity increases at depths of about 
10 and 25 km and a fast lower crust while the Great Basin has approximately 35- 
km-thick crust and a 2.9-km-thick sedimentary layer. 


Introduction 

Crustal structure of the Colorado plateau and the Basin 
and Range provinces of the western United States has been 
a subject of considerable debate for the last three decades. 
Several seismological data types, including receiver func- 
tions, refraction profiles, body and surface waveforms, and 
travel times, have been used to constrain the average one- 
dimensional velocity structure in these regions (Roller, 
1965: Keller et aL, 1979; Priestley and Brune, 1978: Wolf 
and Cipar, 1993; Ozalaybey et aL, 1997; Song et aL, 1996; 
Sheehan et aL, 1997). These studies have presented varying 
estimates of crustal thickness and the depth to the crustal 
discontinuities. The difference between these estimates can 
be attributed to the different spatial sampling of the studies 
coupled with various assumptions adopted in each of the 
analyses. Recent availability of high dynamic-range, digital 
broadband instrumentation enables us to greatly improve our 
ability to estimate seismic parameters from regional seis- 
mograms. These seismograms have been shown to be 
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strongly sensitive to the crust and lithospheric structure (e.g., 
Song et aL, 1996; Rodgers and Schwartz, 1997, 1998; Oz- 
alaybey et aL, 1997). Use of improved analytical and com- 
putational tools, for example, computing source parameters 
and synthetic seismograms, let us remove some of the as- 
sumptions made during the analysis and improve signal ex- 
traction. Moreover, availability of portable high-quality 
three-component seismometers allow us to rapidly increase 
the spatial sampling of our data, thereby letting us investi- 
gate unexplored regions. 

In principle, modeling that makes use of the whole seis- 
mic waveform, including both body waves and surface-wave 
modes, should have advantages over methods that use only 
narrowly selected parts of seismograms, such as arrival times 
or phase velocities. Full waveform modeling also allows us 
to use information contained in higher modes without having 
to explicitly identify those modes. This modeling technique 
does not require the station density required by body-wave 
tomography and has increased sensitivity to crustal structure 
over teleseismic phase velocities. Thus, it is a technique ide- 
ally suited for regional deployments of broadband sensors 
such as the Colorado plateau to Great Basin PASSCAL ex- 
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periment (Jones et al. t 1996). By including both radial and 
vertical waveforms in our modeling, we obtain increased 
sensitivity to 5V and reduce the chances of modeling noise 
coherent on only one channel. Finally, our complete wave- 
form modeling technique allows for phase conversions due 
to the intracrustal layers and for regional w'aveforms in- 
creases the delineation of thin crustal layers. 

In this study, we adopt a forward-modeling approach to 
constrain the crustal structure in both the Colorado plateau 
and the Great Basin provinces. A simple crustal model is 
adopted, and synthetic seismograms are computed for these 
models. We search the solution space using a large number 
of models and select the model that best fits the synthetics 
to the recorded data using some fitness measure. We adopt 
the reflectivity technique (Randall, 1994) to compute the 
synthetics. For the search algorithm, we adopt the genetic 
algorithm (GA) modeling technique in this study. Many seis- 
mological optimization problems are nonlinear and result in 
irregular fitness functions. Moreover, they can have a rough 
fitness landscape with several local minima. Thus, local op- 
timization techniques, such as linearized matrix inversion, 
steepest descent, and conjugate gradients, can converge pre- 
maturely to a local minima. In addition, the success in ob- 
taining an optimum solution can depend strongly on the 
choice of the starting model. To mitigate these problems, 
global optimization techniques that avoid these limitations 
are particularly useful in seismology. GA is such a technique 
that allows us to efficiently search the model space and con- 
verge toward a global minima. Using these tools, we have 
developed a parameter estimation technique that lets us rap- 
idly model the one-dimensional crustal structure using three- 
component broadband seismograms. 

In the following sections, we first discuss the data and 
the sensitivity of the various model parameters on the data. 
Next, we develop the GA search scheme and test our tech- 
nique w ith synthetic data w ith various levels of noise added 
in. This step allows us to estimate the robustness of our 
technique. Finally, this method is used to model the crustal 
structure in the Colorado plateau and Great Basin provinces. 


Dataset 

The seismograms used in this study come from the Col- 
orado plateau-Great Basin (CPGB) deployment of 1 1 broad- 
band portable sensors in Colorado and Utah from October 
1994 to July 1995 (Jones et ai, 1996; Sheehan et ai, 1997 ). 
Broadband recordings from nearby U.S. National Seismic 
Network (USNSN) stations Battle Mountain, Nevada (BMN), 
Kanab, Utah (KNB), Elko, Nevada (ELK), and Mina, Nevada 
( MNV), have been added to this dataset. The station locations 
are shown in Figure 1. In this study, waveforms from the 
mine collapse event in southwestern Wyoming (Pechmann 
et ai . 1995) are modeled. The seismic event ( M h = 5.2) 
occurred on 3 February 1995, 29 km west of the town of 
Green River, Wyoming (Fig. 1). This is a well-modeled 


event and is within 140 to 800 km of the CPGB stations. The 
use of a well-calibrated source minimizes the biasing effects 
of unmodeled source signature in the estimation of seismic 
structure. Pechmann et ai (1995) have concluded that the 
seismic source can be represented as a 0.5-km-deep mine 
collapse event. Therefore, with a shallow source depth and 
for regional propagation distances, the body w ; aves and the 
surface waves recorded in the early part of the seismogram 
(and used in this study) primarily travel through the crust 
and can constrain the average crustal structure. The accurate 
determination of both the source location and origin time 
allows us to estimate correctly the arrival times of the body 
and surface waves. 

Both the CPGB and the USNSN stations recorded the 
event on three components. No clear arrival is observed in 
the transverse component, primarily due to the implosional 
nature of the source. This point has also been noted by Pech- 
mann et ai (1995). Therefore, in this study, we model the 
vertical and the radial components that can be adequately 
combined to investigate high-resolution one-dimensional 
crustal structure (Song et ai , 1996). We add that, in earth- 
quake events, the technique described in this article can read- 
ily incorporate the transverse-component data and thereby 
improve the shear velocity model. 

The recorded seismograms are first instrument decon- 
volved and integrated to displacement. This is important be- 
cause it allows us to combine recordings made using differ- 
ent instrument types and also allow s us to extend the data to 
longer periods. For example, clear signals are observed up 
to 40 sec w'hile the long-period roll offs for the CPGB in- 
struments are typically at 30 sec. Next, a third-order zero- 
phase Butterworth filter is applied with a bandpass between 
10 and 50 sec. The particular frequency cutoffs are used 
primarily for two reasons: (a) the high-frequency features in 
the seismogram are not modeled in this study because they 
can strongly be affected by lateral variations of elastic prop- 
erties and thereby bias the one-dimensional model; (b) sig- 
nals at very long periods had to be removed because noise 
levels are amplified at these periods due to instrument de- 
convolution, thereby severely contaminating the signal. The 
radial- and vertical-component seismograms used in this 
study are shown in Figure 2. We can observe clearly the 
Rayleigh waves in the seismograms, and as expected for the 
collapse-source mechanism, Love waves are absent for this 
event. We have concentrated our modeling efforts on the 
Rayleigh waves because of two reasons: (a) these waves are 
the most prominent arrival observed in the data (Fig. 2) and 
are thus expected to have a large signal-to-noise ratio; 
(b) these waveforms have simple wave shapes and can be fit 
using simple ID crustal models. However, the w ; aveform 
fitting technique used in this study is capable of modeling 
the complete waveform, that is, body waves and surface 
waves. The data window is selected by marking the moveout 
velocities, isolates the Rayleigh wave (Fig. 2). These mov- 
eout velocities are also used to excise the Rayleigh wave- 
form from the synthetics, thereby preserving the duration 



204 


J. Bhattacharyya, A. F. Sheehan, K. Tiampo, and J. Rundle 



Figure 1. Location of the 3 February 1995 
mine-collapse event and the seismic stations at 
regional distances used in this study. Stations 
BMN, DUG, ELK, KNB, and MNV belong to the 
U.S. National Seismic Network (USNSN), and 
the rest of the stations are from the Colorado 
plateau to Great Basin (CPGB) PASSCAL ex- 
periment. The diamonds indicate stations used 
for estimating Great Basin crustal structure, 
and stars indicate stations used for modeling 
the Colorado plateau crust. Solid lines denote 
state boundaries, and dashed lines show 
boundaries between geologic provinces. 


and the arrival-time information when the synthetics are 
compared to the data. We use a conservative window size 
that allows for variable arrival times due to laterally varying 
seismic structure and dispersion. Typically, the extracted 
waveforms are 35 to 50 sec long. The vertical lines in Figure 
2 show the seismograms and the data window used in this 
study. 

Model Sensitivities 

Given an accurate source location and mechanism, a 
usual problem for waveform modeling is to understand the 
sensitivity of the data to the model parameters. Moreover, 
regional seismograms produced by shallow events, like the 
one used in this study, can be very complicated (e.g.. Song 
et ai, 1996). To better understand these effects, we have 
carried out simple tests using synthetic seismograms com- 
puted using different one-dimensional crustal models. Our 
goal is to identify the parameters in the model that the data 
are sensitive to and allow variations only in those parameters 
in the modeling procedure. To simulate real data analyzed 
in this study, we carried out these tests using reflectivity 
synthetics (Randall, 1994), and we used the moment tensor 
source presented for this event by Pechmann et ai, 1995. 
We have used a one-dimensional velocity model given by 
Wolf and Cipar (1993) that is derived from refraction lines 
across southeastern Utah and northeastern Arizona. The 
model, inverted from the Chinle-to-Hanksville line, essen- 
tially consists of three crustal layers of constant layer veloc- 
ity and thicknesses. We have slightly modified this model 
by adding a sedimentary layer. Simplified models are de- 
veloped by varying each of the model parameters separately, 
isolating the sensitivity of the synthetics to each of them. 


The radial- and vertical-component synthetics are filtered to 
periods between 10 and 50 sec to make them consistent with 
the recorded seismograms. Following Patton and Taylor 
(1984), the shear-wave attenuation (Q (i ) in the crust is fixed 
to 100, and we use a scaling ratio of Q a :Qp equal to 2. Fi- 
nally, the synthetics are computed for distance ranges of 200 
to 900 km, encompassing the source-to-receiver distances 
used in our study. 

We have carried out a number of experiments to esti- 
mate separately the effect of layer velocities, thicknesses, 
Poisson’s ratio, and densities. By systematically varying the 
velocities of the sediment layer, upper and lower crust, and 
upper mantle, we tested the sensitivity of the synthetic seis- 
mograms to these parameters. From these experiments, we 
find that these shallow source waveforms are strongly af- 
fected by the sediment layer and the upper crustal layer ve- 
locities, are affected by the lower crust at larger distances 
(probably due to their increased depth of penetration), and 
are insensitive to the uppermost mantle velocity. Because 
we use a fixed Vp:Vs ratio, changing F’-wave velocities ef- 
fectively modifies tfiie 5-wave velocities too. Changing layer 
thicknesses had a similar effect and therefore can trade off 
with the velocities in the fitting process. The synthetics are 
slightly affected by the density of the crust and insensitive 
to upper mantle density variations. We vary Poisson’s ratio 
between 1.73 and 1.82, thereby covering the range of pos- 
sible continental values (Rudnick and Fountain, 1995; Chris- 
tensen and Mooney, 1995; Christensen, 1996); this factor 
slightly affects the synthetics. The synthetics are sensitive to 
the total crustal thickness at distances larger than 300 km. 
We conclude from these tests that the synthetics are mostly 
sensitive to the shallow layers (approximately 0 to 30 km), 
a point also noted in Song et ai (1996). We also note that 
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Figure 2. (a) The vertical- and radial-component seismograms used in this study to estimate Colorado 

plateau crustal structure. The solid lines show the recorded seismograms that have been instrument de- 
convolved to displacement and bandpass filtered between periods 10, and 50 sec. For each station, the 
upper traces are from vertical-component seismograms, and the lower trace is radial component. The 
vertical lines indicate the portion of the seismograms used for waveform fitting, and the source-to-receiver 
distance is shown for each station. Each of the seismograms are normalized to their maximum amplitudes. 

The dashed lines show the synthetic waveforms predicted by our best-fitting one-dimensional crustal 
model for this region. Note that the recorded seismograms are slightly slower than those predicted by 
our model for stations CYF, LMP, and SRS and faster for stations WMT and RCC. (b) Similar to Figure 
2a, but for stations used in modeling the crustal structure of the Great Basin province. We note that the 
seismograms recorded at stations BMN and ELK are slightly slower and seismograms at WCP and RTS 
are slightly faster than those predicted by our average model. 


the seismograms are not sensitive to the uppermost mantle 
structure. Because we do not have a complete data coverage, 
trade-offs between model parameters can occur. To diminish 
these effects, we allow variation in model parameters within 
prescribed limits whose values are based on published re- 
sults from the study regions. Also, these trade-offs can give 
rise to several local minima; GA is capable of identifying the 


global minimum in such a situation (Goldberg, 1989). Based 
on these tests, it can be concluded that the data are adequate 
to model for four crustal layers, which includes a thin sed- 
imentary layer. However, it is stressed that the whole crust 
in the western United States is not as simple as our ID mod- 
els (Hearn et al , 1991; Wolf and Cipar, 1993; Sheehan et 
al 1997). Our goal in this study is to develop a technique 
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that can rapidly obtain an average ID regional model and is 
consistent with the relatively long-period data (10 to 50 sec). 

Use of Genetic Algorithms in Seismic 
Waveform Modeling 

In recent years, GA have been employed in the solution 
of nonlinear optimization problems in the physical sciences. 
Most geophysical modeling problems are traditionally 
solved using local optimization techniques, such as linear- 
ized matrix inversion, steepest descent, conjugate gradients, 
or grid search techniques. These techniques can sometimes 
converge prematurely to a local minima, and success can 
depend strongly on the choice of a starting model. A global 
optimization technique such as a GA mitigates these prob- 
lems and are an attractive search tool suitable for the irreg- 
ular, multimodal fitness functions typically observed in mod- 
eling seismic waveforms. Since a GA undergoes an initially 
random and progressively more deterministic sampling of 
the parameter space, these algorithms offer the possibility of 
efficiently and relatively rapidly locating the most promising 
regions of the solution space. Their ability to solve nonlin- 
ear, nonlocal optimization problems without a priori knowl- 
edge of curvature information precludes the need for deriv- 
ative computations, a particularly important feature because 
it allows for fast approximate forward modeling where no 
exact derivative information is available (Wright, 1991). Be- 
cause genetic algorithms sample the space directly, lineari- 
zation of the problem is unnecessary, thus avoiding errors 
involved in this approximation. Use of synthetic seismo- 
grams for modeling seismic waveforms, which involves the 
nonlinear interaction of model parameters, is an obvious ap- 
plication of this feature. 

Several recent studies have employed GAs to invert for 
seismic structure (Stoffa and Sen, 1991; Sen and Stoffa, 
1992; Stoffa et aL, 1994; Boschetti et aL, 1996), hypocenter 
relocation (Billings et aL, 1994), seismic phase alignment 
(Winchester et aL, 1993), fault-zone geometry (Yu, 1995), 
.mantle velocity structure (Neves et aL, 1996; Curtis et aL, 
1995; Lomax and Snieder, 1995), and crustal velocity struc- 
ture (Jin and Madariaga, 1993; Drijkoningen and White, 
1995; Zhou et aL, 1995). Our modeling of seismic structure 
in the Colorado plateau and eastern Great Basin will use a 
GA as shown in Figure 3. The program employs a random 
number generator to produce an initial set of 100 potential 
values for each of the waveform parameters, within an ini- 
tially specified range of acceptable values. These model val- 
ues are then coded as genes, which in turn are combined to 
form specific models, or chromosomes, for each member of 
the initial population of 100 potential solutions. These mod- 
els are ranked, from the best to the worst, according to a 
fitness function, which is obtained from the cross-correlation 
function computed between each synthetic and the recorded 
seismograms. We do not shift the waveforms in time to im- 
prove the fit. This procedure preserves the absolute travel 
time of the Rayleigh waves and, because the mine collapse 
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Figure 3. The flowchart of the genetic algorithm 
adopted in this study. The individual steps are dis- 
cussed in the text. 


origin time is well constrained ( Pechmann et aL, 1995), pro- 
vides an important constraint on the velocity structure. The 
models with the highest cumulative correlation coefficient 
(used in this study as a fitness measure and described later) 
are the fittest and are selected, based on their relative fitness, 
to contribute to the next generation, where the genetic opera- 
tions of crossover and mutation take place (Wright. 1991; 
Michalewicz, 1996). Between each subsequent generation of 
the GA, a crossover operation occurs, in which two randomly 
selected members of the new population swap genes based 
on a randomly generated position in the string. The parent 
models are recombined, with the left portion of one parent 
and the right portion of the other parent creating one new 
model, or offspring, of the correct length. The corresponding 
recombination on the remaining subchromosomes creates a 
second offspring. In two-point crossover, the parents are 
split at two randomly selected locations in the parent models, 
with everything to the right of the first location in one parent 
recombining with everything to the left of the same location 
in the second parent, while the second offspring is generated 
from everything to the left of the second location in the first 
parent and everything to the right of that same location in 
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the second parent (Michalewicz). Crossover is the device by 
which the random information exchange takes place — new 
members with different fitness are generated in order to 
evolve our directed search. In a GA, mutation corresponds 
to the replacement of a gene, that is, a model parameter, by 
a new, randomly generated value. Mutation is done to ex- 
plore random parts of the model space that might have been 
otherwise missed. 

Each model parameter is represented by an array of real 
values (Wright, 1991). The CPU intensive nature of the par- 
ticular fitness function employed for the waveform modeling 
and the necessity for high precision calculations makes the 
use of a real-valued GA very attractive. We use an elitist 
function in this algorithm that ensures that the best member 
of all populations, current and prior, is stored in memory and 
then copied into each current population, where it might oth- 
erwise no longer exist (Michalewicz, 1 996). To avoid search 
stagnation, a windowing function was added to the GA. Sig- 
nificant trade-offs between the crustal seismic properties can 
result in a rough fitness landscape with numerous local min- 
ima. This particular fitness landscape can generate a popu- 
lation with a small standard deviation of the population fit- 
ness after only a relatively few' generations. This can 
decrease the selection pressure toward better structures, 
causing the search to stagnate and even reach premature con- 
vergence in several instances. As a result, we created a rou- 
tine to window' the fitness by subtracting the fitness value of 
the worst member of the previous generation from every new' 
population fitness, prior to selecting the next generation. 
This ensured that those members with a better relative fitness 
were included in a greater proportion in the next generation, 
despite the small absolute difference in their fitness. 

Features of the Waveform Modeling 

Modeling Constraints. As described earlier, several recent 
studies have modeled the crustal structure of the Colorado 
plateau and the Great Basin provinces. Constraints on the 
seismic structure from these studies can greatly improve the 
search algorithm. The constraints help us primarily by (a) 
reducing trade-offs between model parameters, (b) decreas- 
ing the possibility of converging to a local minima, and (c) 
increasing the convergence rate by allowing us to consider 
a subset of the models. The following constraints have been 
used in the GA modeling. 

1. Receiver function constraints on the Ps times. The Ps 
times are defined as the travel time difference between 
the direct P and the converted S wave resulting from the 
interaction of the P wave with the Moho under a given 
station. The Ps times used here are obtained from the 
study of Sheehan et al. (1997). The Ps time for a crustal 
model can be calculated from the average P and S veloc- 
ities, crustal thickness, and an average ray parameter for 
the incoming P wave (Zandt et al. , 1995). The observed 
Ps times for stations in the Great Basin and Colorado 


plateau provinces are 4.2 ± 0.4 sec and 5.1 ± 0.4 sec, 
respectively (Sheehan etal., 1997). In this study, we limit 
our search to models that produce theoretical Ps times 
within these limits. In conjunction with average crustal 
velocity, this constraint can improve our estimate of the 
total crustal thickness. 

2. V p :V s ratio. In the GA modeling, we use a fixed V p :V x 
ratio. For the sedimentary layer, this value equals 1.73 
(Poisson's ratio — 0.25), and for the rest of the crust, we 
use 1.77 (Poisson's ratio = 0.265), following the results 
of Schneider (1997). 

3. Density. The densities for the crustal layers are calculated 
using the empirical relation p = 0.252 + 0.3788 X V p 
(Nafe and Drake. 1957). This constraint has also been 
successfully used in several recent western U.S. crustal 
studies (Wolf and Ciper, 1993). 

4. Upper mantle velocity. In a recent study, Schneider 
(1997) analyzed the surface-wave phase velocity curves 
to explicitly model for the upper mantle shear-wave ve- 
locity in this region. These phase velocity data provide 
better constraints on the uppermost mantle velocity than 
our data, and we use the Pn velocity estimate from this 
study in our modeling. Briefly, we have fixed the upper- 
most mantle P-wave velocity in our model to 7.9 km/sec 
and use an uppermost mantle 5-wave velocity of 4.46 km/ 
sec. 

Input Parameters . The following parameters are set a priori 
in the modeling algorithm and have the same value in each 
generation: 

1. Probability of crossover. We tested the convergence for 
both one- and two-point crossover schemes and found 
their rates to be similar, producing nearly the same model 
solutions. We thus use both of these variations of GA in 
our modeling. The crossover probability was set based 
on tests run on synthetic data to determine the best con- 
vergence for different crossover rates. 

2. Probability of mutation. This probability value controls 
the number of times mutation occurs in a given popula- 
tion, that is, for models in a particular generation. This 
value can be optimally set to equal l/n, the number of 
model parameters (Back, 1996). In this study, n is equal 
to 8. 

3. Range of model parameters. The algorithm chosen for 
this study randomly searches for model parameters w ithin 
a prescribed range of values. Unrealistic model parame- 
ters can lead to inordinately large model spaces, thus 
leading to slower convergence. 

The Fitness Function Value. In this algorithm, we seek the 
model that produces synthetics that most closely fit the ob- 
served data. For the fitness function of a given crustal model, 
we consider the following two measures: 
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1. The cross-correlation coefficient (CC) between the syn- 
thetic and the data (both normalized to unity) for each 
seismogram. The average CC for all the stations, using 
both vertical and radial components, is computed and 
used as the cumulative correlation coefficient (r) for that 
model. In computing this average, the seismograms are 
all equally weighted. Also, we preserve the absolute 
times of the Rayleigh waves by fixing the fitting window 
for each seismogram. 

2. The root-mean-squared (rms) misfit between the data and 
the synthetics. 

We note that both the above-mentioned fitness measures 
are based on the L2 norm (Parker, 1994). Therefore, these 
measures selectively fit the larger amplitudes in the seis- 
mogram. Thus, the use of an accurate estimate of the source 
time along with the use of the same time windows in the 
data and the synthetic, using moveout velocities, allows us 
to preserve the travel times of the Rayleigh waves in the 
fitting process. Tests with synthetic data show that the best- 
fit models, using either of the fitness values, give nearly iden- 
tical solutions. In a similar analysis, Rodgers and Schwartz 
(1998) have also reached the same conclusion. As synthetic 
tests determined that the average correlation coefficient 
showed faster convergence, we adopted this measure in our 
fitness measure. 

The GA chooses the model with the greatest fitness 
value (FV) to be the fittest member of the population. There- 
fore, FV must be calculated to be an ever-increasing function 
of the fitness measure. A simple solution is to exponentiate 
the correlation coefficient, r, and set that equal to FV. How- 
ever, in the presence of several local minima, with fitness 
values approaching the global minima, premature conver- 
gence can occur as the crossover procedure between models 
with marginally inferior fitness corrupts the algorithm. In- 
creasing the separation between the fitness values of these 
minima can be accomplished by simply multiplying the ex- 
ponent by a constant value, which effectively avoids this 
problem. We therefore chose as the fitness value 

FV = exp(30 X r ). 

The multiplier, 30, is chosen so that FV is a large number 
that does not compromise the numerical accuracy of the 
computer. 

Termination of the GA Search. As mentioned earlier, the GA 
uses random crossover and mutation to explore new regions 
of the solution space and thereby avoid getting trapped into 
such a local minima. Because there is no formal way to 
estimate the global minima using GA, identifying the gen- 
eration at which to stop the search and accept the solution 
can be difficult. The final model is usually selected by either 
terminating the GA after no improvement has been observed 
for a number of subsequent generations or when the FV is 
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larger than some predetermined value. The latter criteria re- 
quires us to make several assumptions regarding the forward 
problem and the noise levels of the data, which can be ad 
hoc. To avoid such assumptions, we choose the first criteria; 
that is, we accept the model that best fits the data for at least 
20 subsequent generations. This technique can still lead to 
the choice of a local minima. To avoid such convergence, 
we can carry out the GA with different starting conditions 
and then compare the terminated solutions (Levin and Park, 
1997). Toward this end, we compute solutions for several 
different randomizing seed values and use different crosso- 
ver schemes, that is, crossover occurring at one or two points 
in the gene. We identify the consistent features of these mod- 
els and accept the average of the models as the global min- 
ima. This approach implicitly assumes that separate GA runs 
all converge toward the global minima; tests with synthetic 
data confirm this assumption. Also, the variation of each 
parameter between different GA runs gives an estimate of 
the robustness of our selected solution. 

Testing the Modeling Algorithm 

Synthetic tests have been carried out to investigate the 
accuracy of the GA modeling. Seismograms are computed 
for a published model of the Colorado plateau (Wolf and 
Cipar, 1993) for regional source-to-receiver distances. The 
resulting synthetic waveforms are then analyzed for the 
crustal structure using the GA procedure outlined in the pre- 
vious section. The best-fit model is then compared to the 
input model to estimate the accuracy of our modeling al- 
gorithm. The seismograms are computed for source-receiver 
distances representative of this study, that is, between 150 
and 850 km, and we use an implosion at a depth of 0.5 km 
as the seismic source. For consistency, the synthetic is pro- 
cessed similar to the data. The tests were carried out using 
both noise-free synthetics and with seismograms containing 
additive Gaussian noise. In the following sections, the effi- 
cacy and the resolving capability of the modeling algorithm 
in obtaining the input model from the data is discussed. 

Input Model We use the Chinle-Hanksville, Utah, model of 
Wolf and Cipar (1993). The model consists of three crustal 
layers of constant thickness and includes a mid-crustal re- 
flector. This model includes slight velocity gradients in each 
layer. Velocity gradients within layers are not modeled in 
our study, and so we use the average layer velocity as a 
representative value. We also add a sedimentary layer to our 
input model. Thus the input model (IP) consists of sedimen- 
tary, upper, middle, and lower crustal layer thicknesses of 
1.5, 10.0, 20.0, and 18.0 km, respectively, with correspond- 
ing P - wave velocities are 4.0, 6.0, 6.25, and 6.95 km/sec, 
respectively (Fig. 4). 

One of the important set of input parameters in the mod- 
eling algorithm is the selection of limiting values of the 
model space. We have used conservative estimates in these 
synthetic tests: ±0.25 km/sec of the input values for the 
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Figure 4. Results from synthetic test of the mod- 
eling algorithm developed in this study. We show the 
best-fit inverted models computed using seismograms 
constructed from the given input model. Two sets of 
seismograms, that is, with and without additive noise, 
are inverted several times each using different ran- 
domizing seed values and crossover schemes. We 
show- the average models in this figure. Note that most 
of the model parameters are retrieved to within ± 5% 
of their input values. 
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velocities and ±25% of the expected layer thickness are 
chosen and the GA randomly selects models between these 
range of values. We add that, typically in the western United 
States, the crustal properties can be estimated more accu- 
rately than these limits. We also use the synthetic experi- 
ments to identify the values of crossover and mutation prob- 
abilities that can give us a fast convergence to the expected 
model. Several runs are carried out for both noise-free syn- 
thetics and those with additive noise, and the final models 
are compared. We find that GA runs with mutation proba- 
bilities between 0. 1 and 0.2 give us the best and nearly iden- 
tical solutions. We choose a mutation rate of 0.125 for all 
our modeling experiments. This number is equal to (1 fn) y 
where n is the number of model parameters and is expected 
to give us a more efficient search algorithm (Back, 1996). 
Crossover rates varying between 0.55 and 0.95 are tested. 
Though higher crossover rates lead to faster convergence 
(Goldberg, 1989), a rate of 0.95 leads to a premature con- 
vergence to a model that is not acceptably close to the so- 
lution. These tests suggest that a crossover probability value 
of 0.85 gives us the most efficient algorithm. 

Test with Noise-Free Data . In this experiment, the com- 
puted synthetics are modeled to obtain the best-fitting output 
model (OP). In the absence of noise and given a long enough 


run time, we expect the solution to ultimately converge to 
IP because the correlation coefficient for this model should 
be equal to 1 . Due to various factors that include the inherent 
nonlinearity of the forward problem, the trade-offs between 
model parameters, and finite computing resources combined 
with the fact that the convergence rate of GA gets progres- 
sively slower as it nears the solution, we choose a solution 
(OP) that is reasonably close to IP after no improved model 
has been generated in 20 consecutive generations. The 
model we chose is generated in the 44th generation and has 
layer thicknesses of 1.49, 9.7, 20.17, and 19.03 km and ve- 
locities of 3.85, 5.96, 6.18, and 6.92 km/sec from the shal- 
lowest to the deepest layers, respectively (Fig. 4). A com- 
parison between models OP and IP show r that our modeling 
technique is capable of retrieving the input parameters to 
within 1 to 4% accuracy. The thickness estimate of the deep- 
est layer is off by up to 5.5%. This is not unexpected given 
the lack of sensitivity of our dataset to the deeper layers. 
Figure 5 shows the convergence of the model parameters 
toward the final solution. We can infer that the GA with 
noise-free synthetics converges to the solution adequately 
by the 25th generation. 

Test with Noisy Data. For this test, Gaussian random noise 
is added to the “data” seismograms. A signal-to-noise ratio 
of 10, approximately equal to the SNR observed in the re- 
corded seismograms, is used. Using this realistic example, 
we can estimate the effects the trade-offs between model 
parameters and convergence to a global minimum in the 
presence of multiple local minima. The GA input parameters 
are chosen as before with probabilities of crossover and mu- 
tation fixed at 0.85 and 0.125, respectively. The GA ran- 
domly chooses models where the velocities are allowed to 
vary to within 0.25 km/sec of input values and the thick- 
nesses can be varied by up to 25%. We carried out 1 1 dif- 
ferent GA runs using different random model generators (by 
changing the randomizing seed value in the GA) and also by 
using both one-point and two-point crossover schemes. Fig- 
ure 6 shows representative models obtained using GA with 
noisy data. Note that the models all have similar features and 
are close to the average model. Figure 4 shows the final 
model for this test, which is the average of the individual 
models generated from different GA runs. We use the vari- 
ability of the output model parameters to estimate the errors 
in these values. The average model has layer thicknesses 
equal to 1.57 ± 0.07,9.88 ± 0.39, 19.46 ± 0.38, and 19.23 
± 0.36 km and velocities equal to 4.01 ± 0.04, 6.04 ± 
0.02, 6.21 ± 0.01, and 6.97 ± 0.03 km/sec, respectively, 
from the shallowest to the deepest layers. We accept the 
standard errors as rough accuracy estimates on our estimated 
model parameters. 

Modeling of Recorded Seismograms 

In this section, we present our best-fit estimates of the 
one-dimensional crustal structures of the Colorado plateau 
and Great Basin provinces of the western United States. The 
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Figure 5. P-wave velocity, thickness, and wave- 
form fitness as a function of generation number for a 
genetic algorithm run with noise-free data (described 
in text). The crustal model that best fits the synthetic 
seismograms for an input model (Fig. 7) is plotted for 
each generation. We note that adequate convergence 
is reached by the 25th iteration. A one-point crossover 
scheme is adopted for this run with the crossover 
probability equal to 0.85 and the mutation probability 
equal to 0.125 (described in text). 



crustal structure in these regions are essentially three-di- 
mensional, and the models presented in this study should be 
considered a spatial average over the sampled regions. Note 
that the model parameters along with their error estimates 
are dependent on the parameterization and the range of their 
allowable values. As described earlier in the text, values 
from published literature along with results from sensitivity 
tests justify our initial values. Though modeling small sub- 
sets of data can give us improved fits, we consider all of the 
stations in a region together to improve spatial averaging. 

Crustal Structure of the Colorado Plateau. Seismograms 
recorded at stations CYF, KNB, LMP, MDW, RCC, SRS, and 
WMT are used to model the crustal structure of the Colorado 



Figure 6. Testing the accuracy of the modeling al- 
gorithm in the presence of noise. Noise with signal- 
to-noise ratio of 10.0 (typical for the Rayleigh wave- 
forms used in this study) is added to the synthetic 
seismograms computed for the input model (see text). 
Models generated in 11 G A runs with different ran- 
domizing seed values and with both one-point and 
two-point crossover schemes are show n. The models 
are similar though the variability between the model 
values increases for velocities and thicknesses for the 
deeper layers. 


plateau. The moveout windows used to excise the Rayleigh 
waveforms vary from [2.0 5.0] km/sec for RCC at short 
source-receiver distance to [2.55 3.3] km/sec for the farthest 
station, KNB. The layer thicknesses are allowed to vary be- 
tween [ 1.0 3.0], [8.0 14.0], [15.0 25.0], and [14.0 22.0] km 
and the P-wave velocities between [3.0 5.0], [5.75 6.25], 
[6.1 6.48], and [6.7 7.0] km/sec, respectively, for the shal- 
lowest to the deepest layers. 

Our model, which is constructed from a sparse distri- 
bution of stations, can be compromised by inadequate spatial 
sampling. To resolve this issue, we have also carried out a 
number of experiments by deleting one or more stations and 
estimate the best-fit model. A number of runs are carried out 
for each subset of the dataset, and the average model from 
each of these runs is given in Table 1. We performed three 
separate sets of GA modeling by (a) removing stations RCC 
and KNB, located closest and farthest to the source, respec- 
tively; (b) removing the data recorded at station RCC that 
can be biased by high mountains and deep basins located 
close to the station; and (c) removing stations RCC, WMT, 
and MDW, where a portion of the source-receiver path does 
not lie on the Colorado plateau. We conclude from these 
experiments (Table 1) that the transition zone structure and 
the complicated seismic anomalies in northwest Utah do not 
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Table 1 

Crustal Models for the Colorado Plateau Region 



Dataset 1 

Dataset 2 

Dataset 3 

Dataset 4 


Layer 

Using All 
of the 
Colorado 
Plateau Stations 

RCCIs 
Removed 
from List 

RCC and 
KNB Are 
Removed 

Stations KNB. 
LMP. SRS. 
and CYF 

Colorado Plateau Model 


Various Seed 

Various Seed 

Various Seed 

Various Seed 

Average of 12 GA Runs 


h V> 

h V P 

h V,. 

h V P 

h V r 


Sediments 

1.5 

3.04 

1.5 

3.07 

1.4 

3.1 

1.57 

3.01 

1.5 ± 0.04 

3.05 ± 0.02 

Upper crust 

8.4 

5.76 

8.3 

5.77 

8.3 

5.9 

8.40 

5.76 

8.3 ± 0.05 

5.78 ± 0.01 

Middle crust 

15.2 

6.46 

15.3 

6.45 

15.4 

6.4 

15.25 

6.48 

15.3 ± 0.06 

6.44 - 0.01 

Lower crust 

14.3 

6.99 

14.6 

6.99 

14.3 

6.95 

14.30 

6.99 

14.4 ± 0.1 

6.98 ± 0.002 


We show the average models for four separate inversions using subsets of the dataset, using different randomizing seed values, and two different crossover 
schemes for each dataset. The final Colorado plateau crustal model, which is an average of the separate models generated in each of the runs, is shown in 
the last column of the table. In this table, h signifies thickness in km. and V P signifies P-wave velocity in km/sec. The uncertainty in the final model is for 
this parameterization only and does not cover all possible crustal models. 


bias our one-dimensional model. Between the different mod- 
els, the layer thicknesses are consistent to within 0.2 km, and 
the individual layer velocities vary by less than 0.12 km/sec. 
As described earlier, these values are smaller than the as- 
cribed error bars for the model parameters and indicate that 
each of the solutions are close to the global minima. The 
average of these models is accepted as the one-dimensional 
crustal model for the Colorado plateau (Table 1 ). The largest 
error is in the estimation of thickness of the lower crust. The 
primary features of this model are (a) an average crustal 
thickness of 39.5 km; (b) thin sedimentary layer 1.5 km 
thick; (c) velocity increases of more than 0.5 km/sec at 
depths of 10 and 25 km; (d) a fast lower crustal velocity of 
7.0 km/sec, and (e) absence of mid-crustal low-velocity 
layer. 

Our crustal thickness value, which is based on the Ray- 
leigh-wave fits, receiver function crustal P and S times, and 
a fixed upper mantle velocity, is on the lower end of esti- 
mates for this region. Recent receiver function results of 
Sheehan et ai (1997) indicate an average crustal thickness 
of 42 km in the Colorado plateau (assuming an average 
crustal P-wave velocity of 6.3 km/sec and a Vp:Vs of 1.73). 
Analysis of refraction data suggest an average Moho depth 
of about 41.5 km (Roller, 1965). Surface-wave dispersion 
studies by Keller et al (1976, 1979) and Schneider (1997) 
indicate a crustal thickness of 40, 45, and 43.5 km, respec- 
tively. A reinterpretation of Roller's data by Wolf and Cipar 
(1993) gives a crustal thickness of 48 km, which is not con- 
sistent with our results. 

Figure 7 shows a comparison of our crustal models with 
several published in the literature. Our model most closely 
resembles the model presented by Keller et al. ( 1 976), which 
was constructed using surface waves in a region of the north- 
ern Colorado plateau similar to ours. Our sedimentary layer 
thicknesses and velocities are in excellent agreement with 
those presented in studies that have explicitly modeled this 
layer (Keller et al., 1976, 1979; Roller, 1965). The presence 
of an upper crustal velocity increase near 10 km depth is 
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Figure 7. Comparison of our final Colorado pla- 
teau model with published models of one-dimen- 
sional crustal structure of the same region. The error 
bounds of our preferred model are given in Table 1. 
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consistent with the results of Wolf and Cipar (1993) and 
Keller et al. (1976). We also observe a prominent mid- 
crustal velocity increase at approximately 25 km depth. This 
layer has been observed by Roller (1965) and Prodehl and 
Lipman (1989). Published values for this velocity increase 
range from 27 to 29 km (Allmendinger et al., 1986, from an 
analysis a COCORP line in Northern Colorado plateau), 30 
km (Wolf and Cipar, 1993), 24 km (Keller, 1976), and 29.5 
km (Schneider, 1997). We observe a slower upper crustal 
velocity compared to those reported in the literature. This 
might be because our paths sample more of the slower ve- 
locity sedimentary basins in the northwestern Colorado pla- 
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teau than the other studies. We observe a very high lower 
crustal velocity that is consistent with results from other 
studies of the Colorado plateau crustal structure, for exam- 
ple, 6.8 km/sec (Roller, 1965), 6.8 km/sec (Keller et ai, 
1976), 6.95 km/sec (Wolf and Cipar, 1993), and 7.0 km/sec 
(Schneider, 1997). This can explain the unusually small im- 
pedance contrast observed for the Moho in a receiver func- 
tion study of the CPGB stations (Sheehan et ai, 1997). Also, 
the presence of a relatively thin crust under the high standing 
Colorado plateau lends support to the hypothesis that the 
high topography of this region is not completely compen- 
sated by the crust (Sheehan et ai, 1997). 

Crustal Structure of the Eastern Great Basin . The crustal 
structure of the eastern Great Basin province is modeled us- 
ing data recorded at stations BMN, ELK, GAR, MLC, MNV, 
RTS, and WCP (Fig. 1). The ray paths sample transects 
through northern Utah and eastern Nevada. As described 
earlier, we prescribe a range of values from which the GA 
randomly chooses the model parameters. For the Great Basin 
model, the ranges are [1.0 6.0], [4.0 12.0], [8.0 12.0], and 
[8.0 16.0] km for the thicknesses and [2.5 4.5], [5.8 6.2], 
[6.1 6.5], and [6.7 7.1] km/sec for the velocities for the shal- 
lowest to the deepest layers in the model. 

Though using the radial- and vertical-component seis- 
mograms for seven stations improves the spatial sampling 
of the Great Basin region, lack of crossing paths can bias 
the model search. This is because we are essentially solving 
for the average crustal structure without complete coverage 
in a strongly heterogeneous region. For example, the path- 
to-station WCP crosses the Great Salt Lake Basin, and Ray- 
leigh waves can be strongly affected by this large sedimen- 
tary feature. Moreover, significant portions of the paths to 
the stations ELK, GAR, and WCP traverse the transition zone 
between the Colorado plateau and Great Basin provinces, 
which has a somewhat different seismic structure than the 
Colorado plateau and Great Basin provinces straddling it 
(Sheehan et ai, 1997). Thus, the best-fit models need to be 
carefully appraised before we accept the average model. To- 
ward this end, we have computed separate GA runs with the 
stations that have paths mostly in the eastern Great Basin 
province, that is, BMN, RTS, MLC, and MNV. These models 
are then compared to those computed for all seven stations. 
In Table 2, we present the best-fit models. Comparing mod- 
els computed from the whole dataset and a subset of the data, 
we note that they are essentially similar though the structure 
of the sedimentary layer differs. The model computed from 
the subset of the data consists of a thicker sedimentary layer 
with faster P - wave velocity. This is most probably a reflec- 
tion of the older basin structures present in northeastern Ne- 
vada. We combined the individual models from all the runs 
and average them for the final Great Basin model (Table 2 
and Fig. 8). This model consists of a 35.0-km-thick crust, 
which is the same as that obtained by Priestley and Brune 
(1978) and is close to the values presented by Schneider 
(1997) and Keller ( 1975), 31.5 km and 29 km, respectively. 


Table 2 

Crustal Models for the Great Basin Province 


Dataset 1 Dataset 2 


Layer 



Using 


Using All 

Stations 


of the 

BMN. RTS. 

Eastern Great Basin 

Great Basin 

MLC. and 

Crustal Model 

Stations 

MNV 




Various Seed 

Various Seed 

Average of 7 Different 
GA Runs 

h 


h 

Vp 

h 


v> 

Sediments 

2.8 

3.0 

3.4 

3.7 

2.9 ± 0.3 

3.1 

± 0.2 

Upper crust 

8.1 

6.1 

7.1 

6.1 

8.0 ± 0.8 

6.1 

± 0.02 

Middle crust 

11.0 

6.4 

11.7 

6.4 

11.1 ± 0.3 

6.4 

r 0.03 

Lower crust 

13.0 

6.7 

13.1 

6.8 

13.0 ± 0.7 

6.8 

± 0.02 


We show the final models for two separate runs: (a) the vertical- and 
radial-component seismograms for all of the Great Basin stations are mod- 
eled (dataset 1); (b) when stations for which the seismic paths are mostly 
in the Great Basin province are modeled (dataset 2). The models are similar 
and the individual models from each run is averaged to produce our Great 
Basin crustal model, which is shown in the last column of the table. In this 
table, h signifies thickness in km and V P signifies P-wave velocity in km/ 
sec. The uncertainty in the final model is for this parameterization only and 
does not cover all possible crustal models. 
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Figure 8. Comparison between our final Great Ba- 
sin model with several published models of one-di- 
mensional crustal structure of the region. Compared 
to others, our model shows a slightly thicker sedi- 
mentary layer and higher lower crustal velocity. We 
do not require the crustal low-velocity zone as re- 
ported by Keller et ai (1976). Uncertainties in our 
preferred model are given in Table 2. 
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Our model indicates a 2.9-km-thick sedimentary layer with 
a P-wave velocity of 3.45 km/sec. This is thicker than the 
shallow layer obtained by Keller et ai (1979) but is com- 
parable to the 2.5-km-layer presented by Priestley and Brune 
(1978) and Song et ai (1996). Our upper crust and middle 
crust layers have an average velocity of approximately 6.1 
and 6.4 km/sec, respectively. The average velocity of these 
two layers of 6.2 km/sec is similar to the values obtained by 
Priestley and Brune (1978), Song et ai (1996), and Schnei- 
der (1997). The lower crustal layer has a P - wave velocity of 
6.8 km/sec. This value is similar to that obtained by Priestley 
and Brune (1978) but is higher than the 6.4 km/sec obtained 
by Song et ai (1996) and Schneider (+997). We note that 
the Rayleigh waves arrive later than predicted at stations 
BMN and ELK, indicating that the crustal velocities are 
slower or the crust is thicker in the northern Great Basin 
province, compared to our average model. Finally, we note 
that the fit to data for the Colorado plateau seismograms is 
comparatively better than those that traverse the Great Basin 
province and is probably due to the strong lateral variations 
of velocity observed in the latter region. 

Conclusions 

In this study, we have developed a technique to search 
the parameter space efficiently to model regional broadband 
seismograms for crustal structure. This technique is based 
on the GA procedure that generates random models consis- 
tent with an allowable set of parameter values. Using an 
accurate estimate of the source mechanism and location, 
these models are used to compute synthetic seismograms 
that are then compared to the recorded data, and the best- 
fitting model is estimated. This technique assumes that the 
seismic source model is available, though this is not a nec- 
essary condition for this technique. Because having more 
free variables leads to longer run times, we vary only eight 
parameters in our search procedure, that is, the thickness and 
P-wave velocities of four crustal layers. Each GA run is car- 
ried out with different randomizing seed values and also with 
different crossover and mutation schemes. This process al- 
lows us to avoid converging to a local minima. 

Using this technique, we have modeled the Rayleigh 
waveforms recorded at the CPGB and USNSN stations for 
separate models of one-dimensional crustal structure in the 
Colorado plateau and the Great Basin provinces of the west- 
ern United States. For each of these regions, several inde- 
pendent runs are carried out with subsets of each dataset to 
estimate the robustness of the final models. The average of 
the models estimated from each of these runs is accepted as 
the crustal model for these regions. These one-dimensional 
models are consistent with other published results of similar 
structure in these regions. The crust is about 5 km thicker 
under the Colorado plateau compared to the Great Basin, 
whereas the latter region has a thicker sedimentary layer. 
Also, our model indicates a very high lower crustal velocity 
in the Colorado plateau region. 
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